NMF-based speech enhancement incorporating deep neural network
نویسندگان
چکیده
Recently, lots of algorithms using machine learning approaches have been proposed in the speech enhancement area. One of the most well-known approaches is the non-negative matrix factorization (NMF) -based one which analyzes noisy speech with speech and noise bases. However, NMF-based algorithms have difficulties in estimating speech and noise encoding vectors when their subspaces overlap. In this paper, we propose a novel speech enhancement algorithm which uses deep neural network (DNN) to improve the encoding vector estimation of the NMF-based technique. A DNN is trained to represent the mapping from noisy speech to corresponding encoding vectors. The quality of the enhanced speech from the proposed NMF-based scheme adopting DNN-based encoding vector estimation is compared with that from the conventional NMF-based technique. The experimental results showed that the proposed speech enhancement algorithm outperformed the conventional NMF-based speech enhancement technique.
منابع مشابه
Investigating NMF speech enhancement for neural network based acoustic models
In the light of the improvements that were made in the last years with neural network-based acoustic models, it is an interesting question whether these models are also suited for noise-robust recognition. This has not yet been fully explored, although first experiments confirm this question. Furthermore, preprocessing techniques that improve the robustness should be re-evaluated with these new...
متن کاملStatistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Non-Negative Matrix Factorization
This paper presents a statistical method of single-channel speech enhancement that uses a variational autoencoder (VAE) as a prior distribution on clean speech. A standard approach to speech enhancement is to train a deep neural network (DNN) to take noisy speech as input and output clean speech. Although this supervised approach requires a very large amount of pair data for training, it is not...
متن کاملDeep Recurrent Neural Network Based Monaural Speech Separation Using Recurrent Temporal Restricted Boltzmann Machines
This paper presents a single-channel speech separation method implemented with a deep recurrent neural network (DRNN) using recurrent temporal restricted Boltzmann machines (RTRBM). Although deep neural network (DNN) based speech separation (denoising task) methods perform quite well compared to the conventional statistical model based speech enhancement techniques, in DNN-based methods, the te...
متن کاملCombining Bottleneck-BLSTM and Semi-Supervised Sparse NMF for Recognition of Conversational Speech in Highly Instationary Noise
We address the speaker independent automatic recognition of spontaneous speech in highly variable noise by applying semisupervised sparse non-negative matrix factorization (NMF) for speech enhancement coupled with our recently proposed frontend utilizing bottleneck (BN) features generated by a bidirectional Long Short-Term Memory (BLSTM) recurrent neural network. In our evaluation, we unite the...
متن کاملA DNN-HMM Approach to Non-Negative Matrix Factorization Based Speech Enhancement
General speaker-independent models have been used in nonnegative matrix factorization (NMF) based speech enhancement algorithms for the practical applicability. And additional regulation is necessary when choosing the optimal models for speech reconstruction. In this paper, we propose a novel utilization of deep neural network (DNN) to select the models used for separating speech from noise. Sp...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2014